Relating the new language models of information retrieval to the traditional retrieval models
Abstract
During the last two years, a number of research groups introduced exciting new approaches to information retrieval that use statistical language models. This paper relates the retrieval algorithms suggested by these approaches to widely accepted retrieval algorithms developed within three traditional models of information retrieval: the Boolean model, the vector space model, and the probabilistic model. The paper shows the existence of efficient retrieval algorithms that use only the matching terms in their computation. Under these conditions, the language models of information retrieval are surprisingly similar to both tf.idf term weighting as developed for the vector space model and relevance weighting as developed in the traditional probabilistic model. The paper suggests a new method for relevance weighting and a new method to rank documents given Boolean queries. Experimental results on the TREC collection indicate that the language modelling approach outperforms the three traditional approaches.
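The claim that a language model can be ranked using only the matching terms can be illustrated with a minimal sketch. It assumes a unigram document model linearly interpolated with a collection model, which is one common formulation and not necessarily the exact estimator used in the paper. The function names, the toy documents, and the weight LAMBDA = 0.15 are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

LAMBDA = 0.15  # interpolation weight; illustrative value, not taken from the paper


def full_log_likelihood(query, doc, coll_tf, coll_len, lam=LAMBDA):
    """log P(query | doc) for a unigram model interpolated with the collection.

    Visits every query term, whether or not it occurs in the document.
    Assumes every query term occurs somewhere in the collection.
    """
    tf = Counter(doc)
    total = 0.0
    for term in query:
        p_doc = tf[term] / len(doc)
        p_coll = coll_tf[term] / coll_len
        total += math.log((1.0 - lam) * p_coll + lam * p_doc)
    return total


def matching_terms_score(query, doc, coll_tf, coll_len, lam=LAMBDA):
    """Rank-equivalent score that sums only over query terms present in doc.

    Dividing each factor by (1 - lam) * p_coll removes a document-independent
    constant; non-matching terms then contribute log(1) = 0 and are skipped.
    """
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if tf[term] == 0:
            continue  # non-matching term: contributes nothing
        p_doc = tf[term] / len(doc)
        p_coll = coll_tf[term] / coll_len
        score += math.log(1.0 + (lam * p_doc) / ((1.0 - lam) * p_coll))
    return score


# Toy collection (hypothetical data).
docs = {
    "d1": ["language", "model", "retrieval"],
    "d2": ["boolean", "query", "retrieval", "retrieval"],
    "d3": ["vector", "space", "model", "weighting"],
}
coll_tf = Counter(term for doc in docs.values() for term in doc)
coll_len = sum(coll_tf.values())
query = ["retrieval", "model"]


def rank(score_fn):
    """Document ids ordered from best to worst under score_fn."""
    return sorted(
        docs,
        key=lambda d: score_fn(query, docs[d], coll_tf, coll_len),
        reverse=True,
    )


ranking_full = rank(full_log_likelihood)
ranking_short = rank(matching_terms_score)
print(ranking_full, ranking_short)
```

In this sketch each matching term contributes the logarithm of one plus its within-document frequency divided by its collection frequency, which is where the resemblance to tf.idf weighting comes from. The two scores differ only by a constant that does not depend on the document, so both functions produce the same ordering.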
Similar resources
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
QEA: A New Systematic and Comprehensive Classification of Query Expansion Approaches
A major problem in information retrieval is the difficulty of defining the user's information needs; in addition, when a user submits a query, there is a vast amount of information to retrieve. Different methods have therefore been suggested for query expansion, which is concerned with reformulating the query to increase efficiency and improve accuracy in the information...
Applying the Expertise Retrieval Model to Find Expert Authors
This research applied the Expertise Retrieval model to find expert authors, and used evaluation methods from Information Retrieval systems to measure the performance of those models. The research is experimental. In addition, a variety of methods, including the survey method, were used in the research process. Various models were developed for finding expert authors, all built on a known ...
Factors Affecting Student's Scientific Information Retrieval based on Fuzzy Logic Method Compared to Traditional Method
Background and aim: The aim of this study was to identify the factors affecting students' performance in information retrieval based on the fuzzy logic method compared to the traditional method. Materials and methods: This descriptive survey study was performed using a quantitative approach. The research population was 34 PhD students, and a researcher-made questionnaire was used. Data were analyzed...
Publication date: 2000